Search for: All records

Creators/Authors contains: "Kalita, Jugal"


  1. The improvement of language model robustness, including successful defense against adversarial attacks, remains an open problem. In computer vision settings, the stochastic noising and de-noising process provided by diffusion models has proven useful for purifying input images, thus improving model robustness against adversarial attacks. Similarly, some initial work has explored the use of random noising and de-noising to mitigate adversarial attacks in an NLP setting, but improving the quality and efficiency of these methods is necessary for them to remain competitive. We extend methods of input text purification inspired by diffusion processes, which randomly mask and refill portions of the input text before classification. Our novel method, MaskPure, matches or exceeds the robustness of other contemporary defenses, while requiring no adversarial classifier training and assuming no knowledge of the attack type. In addition, we show that MaskPure is certifiably robust. To our knowledge, MaskPure is the first stochastic-purification method with demonstrated success against both character-level and word-level attacks, indicating the generalizable and promising nature of stochastic denoising defenses. In summary, the MaskPure algorithm bridges the literature on the current strongest certifiable and empirical adversarial defense methods, showing that theoretical and practical robustness can be obtained together. Code is available on GitHub.
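As a concrete illustration of the masking-and-refilling idea, here is a minimal sketch using a BERT-style masked language model as the de-noiser and majority voting over independently purified copies. The mask rate, vote count, and model choices are illustrative assumptions, not the settings of the MaskPure paper.

```python
# Sketch of diffusion-inspired text purification: randomly mask tokens,
# refill them with a masked LM, classify, and majority-vote over runs.
import random
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
clf = pipeline("sentiment-analysis")  # stand-in downstream classifier

def purify(text: str, mask_rate: float = 0.15) -> str:
    """Randomly mask a fraction of words, then refill each with the MLM."""
    words = text.split()
    idxs = [i for i in range(len(words)) if random.random() < mask_rate]
    for i in idxs:
        masked = words.copy()
        masked[i] = fill.tokenizer.mask_token
        # take the single most likely refill for this position
        words[i] = fill(" ".join(masked))[0]["token_str"]
    return " ".join(words)

def vote_classify(text: str, n_votes: int = 5) -> str:
    """Classify several independently purified copies and majority-vote."""
    labels = [clf(purify(text))[0]["label"] for _ in range(n_votes)]
    return max(set(labels), key=labels.count)

print(vote_classify("the service was painfully slow but the food was great"))
```

Voting over stochastic purifications is also what typically underlies certified-robustness arguments of this kind: if most randomly masked variants classify correctly, a bounded perturbation cannot flip the majority.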
  2. Controlled text generation (CTG) seeks to guide large language model (LLM) output to produce text that conforms to desired criteria. This study presents a novel CTG algorithm that enforces adherence to specific rhetorical relations in an LLM sentence-completion context via a parser-driven decoding scheme that requires no model fine-tuning. The method is validated with both automatic and human evaluation. The code is available on GitHub.
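The abstract does not detail the decoding scheme, so the sketch below approximates parser-driven decoding with a simpler sample-and-rerank loop. The relation_score helper is hypothetical, standing in for a discourse parser that checks whether a candidate completion realizes the target rhetorical relation.

```python
# Sketch: sample several completions, keep the one whose realized
# rhetorical relation matches the target, as judged by a (stubbed) parser.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")

def relation_score(completion: str, relation: str) -> float:
    """Hypothetical stub: 1.0 if a cue word signals the target relation
    (e.g. 'Contrast'); a real system would query a discourse parser."""
    cues = {"Contrast": ("but", "however"), "Cause": ("because", "so")}
    return float(completion.strip().lower().startswith(cues.get(relation, ())))

def complete(prefix: str, relation: str, n: int = 8) -> str:
    """Sample n completions and return the best-scoring one."""
    ids = tok(prefix, return_tensors="pt").input_ids
    outs = lm.generate(ids, do_sample=True, num_return_sequences=n,
                       max_new_tokens=20, pad_token_id=tok.eos_token_id)
    texts = [tok.decode(o[ids.shape[1]:], skip_special_tokens=True)
             for o in outs]
    return prefix + max(texts, key=lambda t: relation_score(t, relation))

print(complete("The hotel was cheap, ", "Contrast"))
```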
  3. The increased prevalence of online meetings has significantly enhanced the practicality of a model that can automatically generate the summary of a given meeting. This paper introduces a novel and effective approach to automating the generation of meeting summaries. Current approaches to this problem produce general and basic summaries, treating the meeting simply as a long dialogue. In contrast, our algorithms generate abstractive meeting summaries driven by the action items contained in the meeting transcript. This is done by recursively generating summaries and applying our action-item extraction algorithm to each section of the meeting in parallel. These sectional summaries are then combined and summarized together to create a coherent, action-item-driven summary. In addition, this paper introduces three novel methods for dividing long transcripts into topic-based sections, both to improve the time efficiency of our algorithm and to mitigate the tendency of large language models (LLMs) to forget long-term dependencies. Our pipeline achieved a BERTScore of 64.98 on the AMI corpus, an approximately 4.98% increase over the previous state-of-the-art result produced by a fine-tuned BART (Bidirectional and Auto-Regressive Transformers) model.
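A minimal sketch of the divide, summarize-in-parallel, and combine structure described above. Fixed-size chunking and a prompt prefix stand in for the paper's topic-based segmentation and action-item extraction algorithm, and the BART checkpoint is an illustrative choice.

```python
# Sketch: split the transcript, summarize sections concurrently with a
# generic summarizer, then summarize the concatenated sectional summaries.
from concurrent.futures import ThreadPoolExecutor
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def chunk(transcript: str, size: int = 400) -> list[str]:
    """Naive fixed-size word chunks; the paper segments by topic instead."""
    words = transcript.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def section_summary(section: str) -> str:
    # Steer the model toward action items with a simple prefix; the paper
    # uses a dedicated action-item extraction algorithm instead.
    return summarizer("Action items discussed: " + section,
                      max_length=60, min_length=10)[0]["summary_text"]

def meeting_summary(transcript: str) -> str:
    with ThreadPoolExecutor() as pool:          # sections handled in parallel
        parts = list(pool.map(section_summary, chunk(transcript)))
    combined = " ".join(parts)                  # recursive combine step
    return summarizer(combined, max_length=120)[0]["summary_text"]
```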
  4. Given vector representations for individual words, many applications require computing vector representations of sentences in a compositional manner, often using artificial neural networks. Relatively little work has explored the internal structure and properties of such sentence vectors. In this paper, we explore the properties of sentence vectors in the context of automatic summarization. In particular, we show that cosine similarity between sentence vectors and document vectors is strongly correlated with sentence importance, and that vector semantics can identify and correct gaps between the sentences chosen so far and the document. In addition, we identify specific dimensions that are linked to effective summaries. To our knowledge, this is the first time specific dimensions of sentence embeddings have been connected to sentence properties. We also compare the features of different sentence embedding methods. Many of these insights have applications well beyond summarization.
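A minimal sketch of the cosine-similarity signal described above, assuming sentence and document vectors are simple averages of word vectors; any embedding method could be substituted.

```python
# Sketch: rank sentences by cosine similarity of their embedding to the
# document embedding, the importance signal the paper analyzes.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def embed(sentence: str, word_vecs: dict[str, np.ndarray]) -> np.ndarray:
    """Compositional sentence vector: the mean of its word vectors."""
    vecs = [word_vecs[w] for w in sentence.lower().split() if w in word_vecs]
    return np.mean(vecs, axis=0)

def rank_sentences(sentences: list[str], word_vecs) -> list[str]:
    doc_vec = np.mean([embed(s, word_vecs) for s in sentences], axis=0)
    # Higher cosine to the document vector ~ higher sentence importance.
    return sorted(sentences,
                  key=lambda s: cosine(embed(s, word_vecs), doc_vec),
                  reverse=True)
```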
  5. Recent papers in neural machine translation have proposed the strict use of attention mechanisms in place of previous standards such as recurrent and convolutional neural networks (RNNs and CNNs). We propose that by running the traditionally stacked encoding branches of attention-focused encoder-decoder architectures in parallel, even more sequential operations can be removed from the model, thereby decreasing training time. In particular, we modify Google's recently published attention-based Transformer architecture by replacing sequential attention modules with parallel ones, reducing training time while substantially improving BLEU scores. Experiments on the English-to-German and English-to-French translation tasks show that our model establishes a new state of the art.
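A minimal sketch of the parallel-branch idea, contrasted with the usual stacked layers, using PyTorch's built-in multi-head attention. The branch count, sizes, and merge strategy are illustrative assumptions, not the paper's configuration.

```python
# Sketch: N self-attention branches read the same input side by side and
# their outputs are merged, rather than each layer feeding the next.
import torch
import torch.nn as nn

class ParallelAttentionEncoder(nn.Module):
    def __init__(self, d_model: int = 512, heads: int = 8, branches: int = 4):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.MultiheadAttention(d_model, heads, batch_first=True)
            for _ in range(branches))
        self.merge = nn.Linear(branches * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each branch attends over the same input independently, so the
        # branches can execute concurrently instead of sequentially.
        outs = [attn(x, x, x)[0] for attn in self.branches]
        return self.merge(torch.cat(outs, dim=-1))

enc = ParallelAttentionEncoder()
print(enc(torch.randn(2, 10, 512)).shape)  # torch.Size([2, 10, 512])
```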
  6. In a world of proliferating data, the ability to rapidly summarize text is growing in importance. Automatic summarization of text can be framed as a sequence-to-sequence problem. Another area of natural language processing that solves a sequence-to-sequence problem is machine translation, which is rapidly evolving due to the development of attention-based encoder-decoder networks. This work applies these modern techniques to abstractive summarization. We analyze various attention mechanisms for summarization with the goal of developing an approach and architecture that improve the state of the art. In particular, we modify and optimize a translation model with self-attention for generating abstractive sentence summaries. The effectiveness of this base model and its attention variants is compared and analyzed on standardized evaluation sets and test metrics. However, we show that these metrics are limited in their ability to effectively score abstractive summaries, and propose a new approach based on the intuition that an abstractive model requires an abstractive evaluation.
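To make the metric limitation concrete, the sketch below shows how an n-gram overlap score assigns zero to a faithful paraphrase; this illustrates the problem the abstract identifies, not the evaluation approach the paper proposes.

```python
# Sketch: unigram-overlap F1, a simplified stand-in for n-gram metrics,
# gives 0 to an abstractive summary that shares no surface tokens.
def unigram_f1(candidate: str, reference: str) -> float:
    c, r = set(candidate.lower().split()), set(reference.lower().split())
    overlap = len(c & r)
    if overlap == 0:
        return 0.0
    p, rec = overlap / len(c), overlap / len(r)
    return 2 * p * rec / (p + rec)

ref = "stocks fell sharply after the announcement"
abstractive = "shares plunged when news broke"
print(unigram_f1(abstractive, ref))  # 0.0 despite equivalent meaning
```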
  7. Many challenges in natural language processing require generating text, including language translation, dialogue generation, and speech recognition. For all of these problems, text generation becomes more difficult as the text grows longer. Current language models often struggle to maintain coherence over long pieces of text. Here, we have the model construct and use an outline of the text it generates to keep the generation focused. We find that using an outline improves perplexity. However, we do not find that the outline improves human evaluation over a simpler baseline, revealing a discrepancy between perplexity and human perception. Similarly, hierarchical generation is not found to improve human evaluation scores.
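For reference, the perplexity measure reported here is the exponential of the mean negative log-likelihood the model assigns to each token; a minimal worked example:

```python
# Sketch: perplexity = exp(mean negative log-likelihood per token).
import math

def perplexity(token_log_probs: list[float]) -> float:
    """token_log_probs: natural-log probability of each generated token."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigns every token probability 0.25 has perplexity 4:
print(perplexity([math.log(0.25)] * 10))  # 4.0
```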